Skip to main content

Polymorphic Data

Overview

Polymorphism in TimeBase is the ability of a single stream to contain multiple message types. See Basic Concepts and Messages to learn more.

In the following example we query tickquerydemo stream for all records. As the result, we get messages of different types: TradeMessage and BestBidOfferMessage.

#Shell format

==> SELECT * FROM tickquerydemo

>deltix.timebase.api.messages.TradeMessage,TIMESTAMP,SYMBOL,TYPE,price,size
0,2011-10-17 17:21:41,XBANK,EQUITY,301.25,800.0
>deltix.timebase.api.messages.BestBidOfferMessage,TIMESTAMP,SYMBOL,TYPE,offerPrice,offerSize,bidPrice,bidSize
1,2011-10-17 17:21:42,XBANK,EQUITY,301.5,60000.0,298.5,800.0
2,2011-10-17 17:21:43,GREATCO,EQUITY,45.0,100.0,43.0,400.0
3,2011-10-17 17:21:43,XBANK,EQUITY,299.5,40000.0,295.0,300.0

Create Polymorphic Output

RECORD

You can use the RECORD ... TYPE ... WHEN construction to return a polymorphic data set, where TYPE sets a new type, and WHEN sets a condition.

#This query returns a polymorphic dataset with fields of both classes TradeMessage and BestBidOfferMessage
WITH
entry AS L1Entry AS l1
SELECT
RECORD
entry.price AS "price",
entry.size AS "size"
TYPE "TradeMessage"
WHEN entry IS KrakenTradeEntry
RECORD
l1[side == ASK].price AS "offerPrice",
l1[side == ASK].size AS "offerSize",
l1[side == BID].price AS "bidPrice",
l1[side == BID].size AS "bidSize"
TYPE "BestBidOfferMessage"
WHEN entry IS L1Entry
FROM kraken
ARRAY JOIN entries AS entry

UNION

You can also produce a polymorphic data set using the UNION construction.

info

Refer to UNION to learn more about this feature.

Examples with UNION keyword
#UNION of these two queries will return a polymorphic dataset with fields of both classes TradeMessage and BestBidOfferMessage
SELECT
trade.price AS "price",
trade.size AS "size"
TYPE "TradeMessage"
FROM kraken
ARRAY JOIN entries[THIS IS KrakenTradeEntry] AS trade

UNION

SELECT
bbo[side == ASK].price AS "offerPrice",
bbo[side == ASK].size AS "offerSize",
bbo[side == BID].price AS "bidPrice",
bbo[side == BID].size AS "bidSize"
TYPE "BestBidOfferMessage"
FROM kraken
ARRAY JOIN (entries AS array(L1entry))[THIS IS NOT NULL] AS bbo

Polymorphic Union Model and GROUP BY SYMBOL

Selecting * means simply selecting the current message - alone. Selecting any other combination of fields implies the construction of a new message type.

For example, we can try and select the combination of price, bidPrice, offerPrice. Even though these fields are picked from two different message types, this is a legal request. Its meaning is "give me the last price and quote values as of the time of each event". The result of such a query is not very useful in our scenario, since it mixes the data of different symbols:

Select from different classes
#Shell format

SELECT price, bidPrice, offerPrice FROM tickquerydemo

>_,TIMESTAMP,SYMBOL,TYPE,price,bidPrice,offerPrice
0,2011-10-17 17:21:40,GREATCO,EQUITY,_,42.5,43.5
1,2011-10-17 17:21:40,XBANK,EQUITY,_,301.25,301.75
2,2011-10-17 17:21:41,XBANK,EQUITY,301.25,_,_
3,2011-10-17 17:21:42,XBANK,EQUITY,_,298.5,301.5
4,2011-10-17 17:21:43,GREATCO,EQUITY,_,43.0,45.0
5,2011-10-17 17:21:43,XBANK,EQUITY,_,295.0,299.5
6,2011-10-17 17:21:44,GREATCO,EQUITY,44.0,_,_
!end

While executing this query, the QQL engine creates an object called Query State, containing, in our example, three fields: price, bidPrice, offerPrice Initially these fields are set to NULL. As each message comes into the query engine, the corresponding fields are updated. As we know, the first underlying message is: [from the select * query]. This first message causes the first update to the Query State, which we see in the output:

#Shell output format
>_,TIMESTAMP,SYMBOL,TYPE,price,bidPrice,offerPrice
0,2011-10-17 17:21:40,GREATCO,EQUITY,_,42.5,43.5

As you see, price is still NULL, because we have not seen a TradeMessage yet. The next message in the source data set is another BestBidOfferMessage, this time for XBANK. This new message again updates the bidPrice and offerPrice fields of the Query State, while price is still NULL. The third message (a TradeMessage) finally updates the price field, and so on. In QQL terminology, the Query State object represents a Polymorphic Union Model (PUM) of the source data.

The following diagram illustrates the flow of data in the query we just ran:

As we mentioned, such a query is useless on market data, as it mixes price from one symbol with quote values from another, as seen in messages #4 and #6:

#Shell format
4,2011-10-17 17:21:43,GREATCO,EQUITY,301.25,43.0,45.0
6,2011-10-17 17:21:44,GREATCO,EQUITY,44.0,295.0,299.5

What we want instead is to track the Query State independently for each symbol. Fortunately, there is a very simple way to make the QQL engine behave the way we want, by specifying the GROUP BY SYMBOL clause. While in relational SQL the GROUP BY clause is only valid when aggregate values are being selected (such as MAX(), COUNT(), etc.), in QQL this clause has the more general effect of determining how many Query States are created, and how they are indexed.

#Shell format
SELECT price, bidPrice, offerPrice FROM tickquerydemo GROUP BY SYMBOL

>_,TIMESTAMP,SYMBOL,TYPE,price,bidPrice,offerPrice
0,2011-10-17 19:21:44,GREATCO,EQUITY,44.0,_,_
1,2011-10-17 19:21:43,XBANK,EQUITY,_,295.0,299.5
!end

Without the GROUP BY SYMBOL clause, there is only one Query State object for all symbols. When the GROUP BY SYMBOL clause is present, there is a separate Query State object message created for each distinct symbol. Each new source message updates the fields of the corresponding Query State:

Adding GROUP BY SYMBOL to our polymorphic query instantly makes the result look the way we expect.